1. CONFIDENCE REGION ESTIMATION

The author has written an interesting article on 
the relationship of confidence distribution and Baye- 
sian posterior distribution. Confidence distribution 
has its origin from Fisher's fiducial distribution, and 
in this discussion we refer to it simply as the "confi- 
dence distribution approach." It allows frequentists 
to assign confidence intervals (or, more generally, 
confidence regions) to the outcome of estimation 
procedures. 

The idea can be simply described as follows. Consider a
statistical model with a family of distributions $p_\theta(y)$,
where $y$ is the observation and $\theta$ is the model parameter.
We assume that the observed $y$ is generated according to a true
parameter $\theta^*$ which is unknown to the statistician. If we
can find a real-valued quantity $U(y;\theta)$ that depends on
$\theta$ and $y$ such that for all $\theta$, when $y$ is generated
from $p_\theta(y)$, $U(y;\theta)$ is uniformly distributed in
$(0,1)$, then we can estimate the confidence interval of $\theta$
given an observation $y$ as the set
$I_{\alpha,\beta}(y) = \{\theta : U(y;\theta) \in (\alpha,\beta)\}$
for some $0 < \alpha < \beta < 1$. An interpretation of this
confidence region is that no matter what the true underlying
$\theta^*$ that generates $y$ is, the region $I_{\alpha,\beta}(y)$
contains the true parameter with probability $\beta - \alpha$
(when $y$ is generated according to $\theta^*$).
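
As a minimal numerical sketch of this coverage property, the
following Python snippet simulates a unit-variance Gaussian
location model and takes $U(y;\theta) = \Phi(y-\theta)$, the cdf
of the observation under $\theta$. The model, the function names
and the number of trials are illustrative assumptions of this
sketch and are not taken from the article under discussion.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian N(0, 1)


def confidence_interval(y, alpha, beta):
    """Return {theta : U(y; theta) in (alpha, beta)} for U(y; theta) = Phi(y - theta).

    Assumed model (illustration only): y ~ N(theta, 1).
    """
    # Phi(y - theta) in (alpha, beta)
    #   <=>  y - Phi^{-1}(beta) < theta < y - Phi^{-1}(alpha)
    return y - STD_NORMAL.inv_cdf(beta), y - STD_NORMAL.inv_cdf(alpha)


def empirical_coverage(theta_star, alpha, beta, n_trials=20000, seed=0):
    """Fraction of simulated observations whose interval contains theta_star."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        y = rng.gauss(theta_star, 1.0)  # y generated from the true parameter
        lo, hi = confidence_interval(y, alpha, beta)
        hits += lo < theta_star < hi
    return hits / n_trials
```

Under these assumptions, a call such as
`empirical_coverage(5.0, 0.05, 0.95)` should return a value close
to $\beta - \alpha$, and the same should hold for any other choice
of $\theta^*$, since no prior on $\theta^*$ enters the calculation.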

Indeed, the above interpretation is a very natural definition of
confidence region in the frequentist setting. It does not assume
that $\theta^*$ is generated according to any prior, and the
interpretation holds universally true for all possible $\theta^*$
in the model. This interpretation can be compared to a confidence
region from the Bayesian posterior calculation that assumes that
$\theta^*$ is generated according to a specific prior which has to
be known to the statistician. If the statistician chooses the
wrong prior, then the confidence region calculated from the
Bayesian approach will be incorrect in that it may not contain
the true parameter $\theta^*$ with the correct probability.

The paper takes this interpretation of confidence region, and
goes on to provide several examples showing that the Bayesian
approach does not lead to correct confidence estimates for all
$\theta^*$. The author then argues that the confidence
distribution approach is the more "correct" method for obtaining
confidence intervals and that the Bayesian approach is just a
quick and dirty approximation.

One question that needs to be addressed in the confidence
distribution approach is how to construct a statistic
$U(y_0;\theta)$ with the desired property. The author considers
the quantity $U(y_0;\theta) = \int_{y<y_0} p_\theta(y)\,dy$,
which is well defined if the observation $y$ is a real-valued
number. This corresponds to the proposal in Fisher's fiducial
distribution. The idea of fiducial distribution has been
discussed many times over the years, and is known to be adequate
for unconstrained location families (for which the fiducial
confidence distribution matches the Bayesian confidence
distribution using a flat prior). However, the general concept is
controversial, and largely regarded as a major blunder by Fisher.

In this discussion article we will explain why the idea of
confidence distribution with

$$U(y_0;\theta) = \int_{y<y_0} p_\theta(y)\,dy$$

has not received more attention for general statistical
estimation problems, although it does give confidence region
estimates that fit the frequentist intuition.
2. SUBOPTIMALITY

The purpose of confidence distribution is to pro- 
vide a confidence region that is consistent with the 
frequentist definition. However, one flaw of this ap- 
proach is that the result it produces may not be 
optimal. While this issue was pointed out in the ar- 
ticle, it was not explicitly discussed. In my opinion, 
this is the main reason why the idea of confidence 
distribution hasn't become more popular in statis- 
tics. Therefore, this section provides a more detailed 
discussion on this issue. 

To understand this point, we shall first consider a simple
illustration. Let $U(y;\theta)$ be a uniform random variable in
$(0,1)$ that is independent of $y$ and $\theta$. By definition,
given any $\theta^*$, the confidence region
$I_{\alpha,\beta}(y) = \{\theta : U(y;\theta) \in (\alpha,\beta)\}$
contains $\theta^*$ with probability $\beta - \alpha$. Since this
applies to the parameter $\theta^*$ that generates $y$, the
confidence region obtained this way is consistent with the
frequentist intuition of what a confidence region should mean.
However, this estimate is not useful statistically because the
method just randomly guesses either the entire domain of $\theta$
when $U \in (\alpha,\beta)$ or the empty region otherwise; the
decision does not even depend on $y$.
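
This degenerate construction can be sketched in a few lines of
Python. The two-element domain used in the usage note and the
function names are hypothetical choices made only for
illustration; the point is that the observation never enters the
computation.

```python
import random


def trivial_region(domain, alpha, beta, rng):
    """Return the whole parameter domain if U falls in (alpha, beta), else the empty set."""
    u = rng.random()  # U is drawn without looking at y or theta
    return set(domain) if alpha < u < beta else set()


def trivial_coverage(theta_star, domain, alpha, beta, n_trials=20000, seed=0):
    """Estimate the probability that the trivial region contains theta_star."""
    rng = random.Random(seed)
    hits = sum(
        theta_star in trivial_region(domain, alpha, beta, rng)
        for _ in range(n_trials)
    )
    return hits / n_trials
```

For instance, `trivial_coverage(1, (0, 1), 0.1, 0.9)` should be
close to $\beta - \alpha = 0.8$, even though every region produced
is either the full domain or empty.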

While the above example is extreme, it does show that a
confidence region merely consistent with the frequentist
semantics is not necessarily a useful estimate. Statistically,
this is because the confidence region obtained is suboptimal. In
fact, this claim also applies to the confidence distribution
approach that the article considers. More specifically, for the
nonlinear problems that the article focuses on, the method can
produce confidence regions that are quite suboptimal. By
"optimal" (or even "good"), we mean that the confidence region a
method produces should be small by some measure. In particular,
if another method provides confidence regions that also fit the
frequentist semantics but are no larger on average for all
$\theta$ and smaller for some $\theta$, then it can be regarded
as a better method. This corresponds to the notion of
admissibility in decision theory.

Consider the following simple nonlinear location estimation
model: $y$ is generated either from $N(0, \sigma_0^2)$ when
$\theta = 0$, or from $N(1, \sigma_1^2)$ when $\theta = 1$. There
are only two possible positions $\theta = 0$ or $\theta = 1$ for
the unknown location parameter $\theta$, and we assume that the
variance parameters $\sigma_0^2$ and $\sigma_1^2$ are known
quantities that are not necessarily equal. Note that the
restriction of $\theta$ to two positions is only for simplicity
and is not critical for our illustration; we can extend the
example to allow all locations in $\mathbb{R}$.



For this example, the confidence distribution approach gives the
following $U(y_0;\theta)$:

$$U(y_0;\theta) = \begin{cases} \Phi(y_0/\sigma_0), & \theta = 0, \\ \Phi((y_0 - 1)/\sigma_1), & \theta = 1, \end{cases}$$

where $\Phi(z)$ denotes the cdf of the standard Gaussian
$N(0,1)$.

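Let's consider the confidence region $I_{\delta,1-\delta}(y)$ for
some $\delta \in (0, 0.25)$, which we simplify as $I(y)$. By
definition, the estimated confidence region $I(y)$ contains the
position $\theta = 0$ if and only if $y \in \Omega_0$ with
$\Omega_0 = (\sigma_0\Phi^{-1}(\delta), -\sigma_0\Phi^{-1}(\delta))$,
and $I(y)$ contains the position $\theta = 1$ if and only if
$y \in \Omega_1$ with
$\Omega_1 = (1 + \sigma_1\Phi^{-1}(\delta), 1 - \sigma_1\Phi^{-1}(\delta))$.
For convenience, we also define

$$\mu_0 = P(y \in \Omega_1 \mid \theta = 0) = \int_{1+\sigma_1\Phi^{-1}(\delta)}^{1-\sigma_1\Phi^{-1}(\delta)} p_0(y)\,dy,$$

where $p_0(y)$ is the density $p_\theta(y)$ at $\theta = 0$, that
is, the density of $N(0, \sigma_0^2)$.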



In order to show that the confidence distribution approach is
suboptimal, we can, for simplicity, consider the case
$\sigma_0 \gg 1$ and $\sigma_1 \ll 1$, so that
$1 - \sigma_1\Phi^{-1}(\delta) < -\sigma_0\Phi^{-1}(\delta)$ and
$\mu_0 < 2\delta$. The first condition implies that
$\Omega_1 \subset \Omega_0$. Therefore, when the parameter
$\theta = 1$, with probability
$P(y \in \Omega_1 \mid \theta = 1) = 1 - 2\delta$ over
$y \sim N(1, \sigma_1^2)$, we have $y \in \Omega_1 \subset \Omega_0$
and, thus, $|I(y)| = 2$ [i.e., $I(y)$ contains both $\theta = 0$
and $\theta = 1$]. Therefore, we have (note that we have assumed
that $\delta < 0.25$)

$$E_{y \mid \theta=1} |I(y)| \ge 2(1 - 2\delta) > 1. \tag{1}$$

Moreover, we have

$$E_{y \mid \theta=0} |I(y)| = P(y \in \Omega_0 \mid \theta = 0) + P(y \in \Omega_1 \mid \theta = 0) = 1 - 2\delta + \mu_0.$$

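Now we would like to construct a better confidence region
estimator by using the condition (which we made earlier) that
$P(y \in \Omega_1 \mid \theta = 0) = \mu_0 < 2\delta$. Under
$\theta = 0$ the complement of $\Omega_1$ then has probability
$1 - \mu_0 > 1 - 2\delta$, so we can pick $\Omega_0'$ such that
$\Omega_0' \cap \Omega_1 = \emptyset$ and
$P(y \in \Omega_0' \mid \theta = 0) = 1 - 2\delta$. This means
that we can choose the following confidence region estimate
$I'(y)$: $I'(y)$ contains the position $\theta = 0$ if and only
if $y \in \Omega_0'$ and $I'(y)$ contains the position
$\theta = 1$ if and only if $y \in \Omega_1$. This estimate obeys
the frequentist definition because
$P(\theta \in I'(y) \mid \theta) = 1 - 2\delta$ both when
$\theta = 0$ and $\theta = 1$. Moreover, we have

$$E_{y \mid \theta=0} |I'(y)| = 1 - 2\delta + \mu_0, \qquad E_{y \mid \theta=1} |I'(y)| \le 1.$$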

The second inequality is due to the fact that $|I'(y)| \le 1$ for
all $y$ because $\Omega_0' \cap \Omega_1 = \emptyset$. In
comparison to (1), we know that when $\theta = 1$, the confidence
distribution approach gives a confidence region $I(y)$ with a
larger average size. This means that for this simple problem, the
confidence distribution approach gives a suboptimal estimate of
confidence region $I(y)$ that is dominated by a better method
$I'(y)$. The difference can be significant when
$\delta \approx 0$.
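
The comparison can be checked numerically with the following
Python sketch. The parameter values ($\sigma_0 = 10$,
$\sigma_1 = 0.1$, $\delta = 0.05$) and all function names are
assumptions chosen for illustration. The argument above only
requires that some $\Omega_0'$ exists; the sketch assumes one
possible construction, namely widening $\Omega_0$ equally in both
tails (in probability under $\theta = 0$) and then removing
$\Omega_1$.

```python
from statistics import NormalDist


def compare_regions(sigma0=10.0, sigma1=0.1, delta=0.05):
    """Expected sizes of I(y) and I'(y) in the two-position model (illustrative values)."""
    std = NormalDist()
    model = {0: NormalDist(0.0, sigma0), 1: NormalDist(1.0, sigma1)}
    q = std.inv_cdf(delta)  # Phi^{-1}(delta), a negative number

    omega0 = (sigma0 * q, -sigma0 * q)
    omega1 = (1 + sigma1 * q, 1 - sigma1 * q)

    def prob(interval, theta):
        lo, hi = interval
        return model[theta].cdf(hi) - model[theta].cdf(lo)

    mu0 = prob(omega1, 0)  # P(y in Omega_1 | theta = 0)
    nested = omega0[0] < omega1[0] and omega1[1] < omega0[1]
    if not (nested and mu0 < 2 * delta):
        raise ValueError("need Omega_1 inside Omega_0 and mu_0 < 2 * delta")

    # Assumed construction: Omega_0' = outer \ Omega_1, where the outer
    # interval has probability 1 - 2*delta + mu0 under theta = 0.
    outer = (
        sigma0 * std.inv_cdf(delta - mu0 / 2),
        sigma0 * std.inv_cdf(1 - delta + mu0 / 2),
    )

    def prob_omega0_prime(theta):
        return prob(outer, theta) - prob(omega1, theta)

    result = {
        theta: {
            "size_I": prob(omega0, theta) + prob(omega1, theta),
            "size_I_prime": prob_omega0_prime(theta) + prob(omega1, theta),
        }
        for theta in (0, 1)
    }
    result["coverage_I_prime"] = {0: prob_omega0_prime(0), 1: prob(omega1, 1)}
    return result
```

With these assumed values, `compare_regions()` should report the
same expected size for both methods at $\theta = 0$, while at
$\theta = 1$ the expected size of $I(y)$ exceeds $2(1-2\delta)$
and that of $I'(y)$ stays at or below 1, in line with (1) and the
bounds above.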

3. CONCLUSION 

The confidence distribution approach is a rather 
general method to obtain confidence regions for pa- 
rameter estimation problems consistent with the fre- 
quentist semantics. The method can also be easily 
generalized to the multivariate situation where y is 
a vector instead of a real number. Nevertheless, the
confidence region it estimates can be rather suboptimal
in the sense that the region obtained by this
method can be significantly larger than what can be
achieved with more sophisticated methods. Although we
have only illustrated this phenomenon with a rel- 
atively simple example, the conclusion holds more 
generally. 

At the root of this suboptimality, we note that whether a model
parameter $\theta_0$ belongs to the confidence region obtained by
the confidence distribution approach only depends on the
distribution $p(y \mid \theta = \theta_0)$ at the parameter
$\theta_0$ itself, without considering the alternative models at
$\theta \ne \theta_0$. This unnatural behavior is what causes its
suboptimality for general nonlinear models. For example, in order
to achieve good performance for the simple two-position location
estimation example given in the previous section, the confidence
region estimate $I'(y)$ at $\theta = 0$ has to be modified in
order to take advantage of the alternative model $\theta = 1$ (so
that $\Omega_0' \cap \Omega_1 = \emptyset$). Such adaptation does
not occur in the confidence distribution approach. As noted by
the author during the discussion of the bounded parameter
example, the confidence distribution estimate does not change
when we restrict the model space, and this phenomenon is rather
odd. The author dismissed this problem as a secondary issue
because it does not change the semantics of the confidence region
in the frequentist interpretation. However, if we are interested
in achieving (near) optimality for the estimated confidence
region, then this issue becomes a more serious concern because it
means that this simple method ignores a significant amount of
available information that could have been used in more
complicated algorithms. In conclusion, while the confidence
distribution approach is simple to apply, the simplicity is
achieved by ignoring some useful information. Therefore, we have
to keep the limitations of this method in mind whenever it is
applied to complex statistical models.



